Address failing _check_disk_space() when path doesn't exist yet#1692
Conversation
|
The documentation is not available anymore as the PR was closed or merged. |
|
Hi @martinbrose thanks for looking into this! I think the problem is not with the confusion between absolute or relative paths. In theory Here is what I think we should do:
This would give an implementation looking like this: target_dir = Path(target_dir) # format as `Path`
for path in [target_dir] + list(target_dir.parents): # first check target_dir, then each parents one by one
try:
target_dir_free = shutil.disk_usage(path).free
if target_dir_free < expected_size:
warnings.warn(
"Not enough free disk space to download the file. "
f"The expected file size is: {expected_size / 1e6:.2f} MB. "
f"The target location {target_dir} only has {target_dir_free / 1e6:.2f} MB free disk space."
)
return
except OSError: # raise on anything: file does not exist or space disk cannot be checked
pass # never raise. It's best to check parent folder insteadCorresponding tests could be extended (1 for missing folder, 1 for relative path). Do you want to take care of it @martinbrose ? Thanks in advance! |
|
HI @Wauplin, thanks for your reply. You seem to be right... just double checked myself. I'll have a look tonight and update the PR tonight. |
|
Hi @Wauplin, your code recommendation makes sense... implemented and tested it. I also added the two additional test cases. While I'm kind of confident with the test case for the non-existent path, I'm not sure I understand exactly the ask for a test case for relative paths. Maybe I addressed it, maybe not. Please let me know. |
| def test_disk_usage_warning_with_non_existent_path(self) -> None: | ||
| # Test for not existent path | ||
| with warnings.catch_warnings(record=True) as w: | ||
| # Cause all warnings to always be triggered. | ||
| warnings.simplefilter("always") | ||
| _check_disk_space(expected_size=self.expected_size, target_dir="not_existent_path") | ||
| assert len(w) == 0 | ||
|
|
||
| # Test for relative path | ||
| with warnings.catch_warnings(record=True) as w: | ||
| # Cause all warnings to always be triggered. | ||
| warnings.simplefilter("always") | ||
| _check_disk_space(expected_size=self.expected_size, target_dir="./not_existent_path") | ||
| assert len(w) == 0 |
There was a problem hiding this comment.
Yes that's exactly how I was seeing it except that 1 path must be relative (let's say "path/to/not_existent_path") and the other one absolute ("/path/to/not_existent_path") to check that both of them doesn't raise an exception.
There was a problem hiding this comment.
| def test_disk_usage_warning_with_non_existent_path(self) -> None: | |
| # Test for not existent path | |
| with warnings.catch_warnings(record=True) as w: | |
| # Cause all warnings to always be triggered. | |
| warnings.simplefilter("always") | |
| _check_disk_space(expected_size=self.expected_size, target_dir="not_existent_path") | |
| assert len(w) == 0 | |
| # Test for relative path | |
| with warnings.catch_warnings(record=True) as w: | |
| # Cause all warnings to always be triggered. | |
| warnings.simplefilter("always") | |
| _check_disk_space(expected_size=self.expected_size, target_dir="./not_existent_path") | |
| assert len(w) == 0 | |
| def test_disk_usage_warning_with_non_existent_path(self) -> None: | |
| # Test for not existent (absolute) path | |
| with warnings.catch_warnings(record=True) as w: | |
| # Cause all warnings to always be triggered. | |
| warnings.simplefilter("always") | |
| _check_disk_space(expected_size=self.expected_size, target_dir="path/to/not_existent_path") | |
| assert len(w) == 0 | |
| # Test for not existent (relative) path | |
| with warnings.catch_warnings(record=True) as w: | |
| # Cause all warnings to always be triggered. | |
| warnings.simplefilter("always") | |
| _check_disk_space(expected_size=self.expected_size, target_dir="/path/to/not_existent_path") | |
| assert len(w) == 0 |
Wauplin
left a comment
There was a problem hiding this comment.
Thanks for making the changes @martinbrose! Test looks good. I've updated it to have both an absolute and a relative paths but that's it. We are good to merge now :)
Codecov ReportAll modified lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1692 +/- ##
==========================================
- Coverage 82.41% 81.96% -0.45%
==========================================
Files 62 62
Lines 7194 7198 +4
==========================================
- Hits 5929 5900 -29
- Misses 1265 1298 +33
☔ View full report in Codecov by Sentry. |
* Add realpath to disk_usage to allow path-like obj * Add logic to address non-existent paths * Add test cased for non-existent paths * Update tests/test_file_download.py --------- Co-authored-by: Lucain <lucainp@gmail.com>
Address issue #1690.
_check_disk_spacefails when only provided with incomplete paths likecheckpoints/stabilityai/stable-diffusion-xl-base-1.0and then tries to asses exactly this "path".It seems to be a regular case when
snapshot_downloadand hencehf_hub_downloadis being called without the parameterlocal_dir.snapshot_download.py(248:250) only performsos.path.realpath()whenlocal_diris not None:My change in this PR now addresses this case and ensures that the real complete path is being used in
shutil.disk_usage().freewithin_check_disk_space().